drmr: A Bayesian approach to Dynamic Range Models in R

Available at lcgodoy.me/slides/2025-esa/

Lucas da Cunha Godoy

EEB Department, UCSC

2025-07-29

Introduction

The Challenge & Limits of SDMs

  • Critical Challenge: Predicting species’ responses to global environmental change is vital for conservation and management.

  • Usual Tool: Species Distribution Models (SDMs) have been the workhorse, correlating occurrences with environmental variables.

  • Limitations of SDMs:

    • Struggle to predict responses under novel future conditions (Pagel and Schurr 2012).
    • Lack Mechanism: They do not explicitly model the underlying biological processes.
    • Equilibrium Assumption: Often violated (Guisan and Thuiller 2005).

Dynamic Range Models (DRMs): A Mechanistic Approach

  • DRMs explicitly incorporate demographic processes that drive range dynamics (Pagel and Schurr 2012).

  • Allows linking environmental drivers directly to specific processes.

  • Potential for more robust forecasting under novel conditions.

  • Why are they rarely in practice? Despite conceptual appeal, DRMs have been underutilized due to their complexity and computational challenges to fit these models.

Introducing drmr: Making DRMs Accessible

  • Goal: Bridge the gap between DRM potential and practical application.

  • Key Features:

    • Makes age-structured DRMs readily available in a Bayesian framework.
    • Leverages Stan via cmdstanr for efficient fitting (Gabry et al. 2024).
    • User-friendly interface.
    • Easily relate environmental drivers to specific recruitment and survival.

Age-structured DRM

Setup & Notation

  • \(Y_{a, t, i}\): Unobserved density of individuals of age \(a\), time \(t\) and site \(i\).

  • \(Y_{t, i} = \sum_{a} Y_{a, t, i}\): Observed density of individuals (all ages) at time \(t\) and site \(i\).

  • \(\lambda_{a, t, i}\): Expected age-specific density.

    • Biological processes (recruitment, survival, and movement) are encoded through these parameters.
  • \(\mu_{t, i} = \sum_{a} \lambda_{a, t, i}\): Expected total density.

  • \(\rho_{t, i} = \mathrm{P}(Y_{t, i} = 0)\): Probability of absence.

Modeling assumption: Zero-augmented probability density function (pdf)

\[ f(y_{t, i} \mid \mu_{t, i}, \phi, \rho_{t, i}) = \begin{cases} \rho_{t, i}, & \text{ if } y_{t, i} = 0, \\ (1 - \rho_{t, i}) g(y_{t, i} \mid \mu_{t, i}, \phi), & \text{ if } y_{t, i} > 0. \end{cases} \]

  • \(y_{t, i}\) is a realization of \(Y_{t, i}\).

  • \(g(\cdot \mid \mu_{t, i}, \phi)\) is the pdf of a continuous probability distribution.

  • \(\phi\) is a nuisance parameter.

Let’s look at some model assumptions!

Demographic processes

graph TD
    subgraph "Demographic processes"
        %% Inputs and Parameters for lambda1
        xr["$$\mathbf{x}^{(r)}_{t, i}$$"] --> lambda1(("$$\lambda_{1, t, i}$$"));
        betar(("$$\boldsymbol{\beta}_r$$")) --> lambda1;
        zr(("$$z^{(r)}_{t, i}$$")) --> lambda1;

        %% Recursive part for lambda_a
        subgraph "Time t-1"
            xs["$$\mathbf{x}^{(s)}_{t - 1, i}$$"] --> s_prev;
            betas(("$$\boldsymbol{\beta}_s$$")) --> s_prev;
            zs(("$$z^{(s)}_{t - 1, i}$$")) --> s_prev;
            f["$$f_{a - 1, t - 1}$$"] --> s_prev;
            lambda_prev(("$$\lambda_{a - 1, t - 1, i}$$"))
            s_prev(("$$s_{a - 1, t - 1, i}$$"))
        end
        
        lambda_prev --> lambda_a(("$$\lambda_{a, t, i}$$"));
        s_prev --> lambda_a;
    end

    %% Mean calculation
    lambda_a --> mu(("$$\mu_{t, i}$$"));

subgraph "Zero-augmentation"
        xt["$$\mathbf{x}^{(p)}_{t, i}$$"] --> rho(("$$\rho_{t, i}$$"));
        betat(("$$\boldsymbol{\beta}_p$$")) --> rho;
    end

    %% Final Observation
    mu --> Y{"$$Y_{t, i}$$"};

    %% Other components
    phi(("$$\phi$$")) --> Y;
    rho --> Y;

Code & workflow

Illustration with real data

Summer Flounder Dataset

  • An example analysis uses Summer flounder (Paralichthys dentatus) data from 1982-2016 NOAA bottom trawl surveys (Fredston et al. 2025) to illustrate the package’s features.

  • The data spans the US Atlantic coast (Cape Hatteras, NC to the Canada/Maine border) and was aggregated from individual hauls into 10 latitudinal patches with varying areas.

  • Response variable: Density (count per unit area).

  • Environmental drivers: SST and SBT.

Models fitted to the data

Model Non-neg. PDF \(\rho_{i, t}\) Recruitment (or Density) Survival
DRM (rec) Gamma Effort + estimated density SST + SST2 + AR(1) Intercept + \(f_{a, t}\)
DRM (surv) Gamma Effort + estimated density AR(1) SBT + SBT2 + \(f_{a, t}\)
SDM (GLM) LogNormal Effort SST + SST2

Code for DRM (rec)

drm_rec <-
  fit_drm(.data = dat_train,
          y_col = "dens", ## response variable: density
          time_col = "year", ## vector of time points
          site_col = "patch",
          family = "gamma", ## other options: lognormal, loglogistic
          seed = 202505,
          formula_zero = ~ 1 + c_hauls,
          formula_rec = ~ 1 + c_sst + I(c_sst * c_sst),
          formula_surv = ~ 1,
          f_mort = f_train,
          n_ages = NROW(f_train),
          adj_mat = adj_mat, ## A matrix for movement routine
          ages_movement = c(0, 0,
                            rep(1, 12),
                            0, 0), ## ages allowed to move
          .toggles = list(ar_re = "rec", ## other options: "surv", "dens"
                          movement = 1,
                          est_surv = 1,
                          est_init = 0,
                          minit = 1),
          .priors = list(pr_phi_a = 1, pr_phi_b = .1,
                         pr_alpha_a = 4.2, pr_alpha_b = 5.8,
                         pr_zeta_a = 7, pr_zeta_b = 3))

Code for DRM (surv)

drm_surv <-
  fit_drm(.data = dat_train,
          y_col = "dens", ## response variable: density
          time_col = "year", ## vector of time points
          site_col = "patch",
          family = "gamma", ## other options: lognormal, loglogistic
          seed = 202505,
          formula_zero = ~ 1 + c_hauls,
          formula_rec = ~ 1,
          formula_surv = ~ 1 + c_sbt + I(c_sst * c_sbt),
          f_mort = f_train,
          n_ages = NROW(f_train),
          adj_mat = adj_mat, ## A matrix for movement routine
          ages_movement = c(0, 0,
                            rep(1, 12),
                            0, 0), ## ages allowed to move
          .toggles = list(ar_re = "rec", ## other options: "surv", "dens"
                          movement = 1,
                          est_surv = 1,
                          est_init = 0,
                          minit = 1),
          .priors = list(pr_phi_a = 1, pr_phi_b = .1,
                         pr_alpha_a = 4.2, pr_alpha_b = 5.8,
                         pr_zeta_a = 7, pr_zeta_b = 3))

Effect of environmental variables on specific processes

  • Recruitment and SST:
newdata_rec <- data.frame(c_sst =
                            seq(from = quantile(dat_train$c_sst, .05),
                                to = quantile(dat_train$c_sst, .95),
                                length.out = 200))

rec_samples <- marg_rec(drm_rec, newdata_rec)
  • Survival and SBT:
newdata_surv <- data.frame(c_sbt =
                             seq(from = quantile(dat_train$c_sbt, .05),
                                 to = quantile(dat_train$c_sbt, .95),
                                 length.out = 200))

surv_samples <- marg_surv(drm_surv, newdata_surv)

Effect of environmental variables

Forecasting

  • The predict_drm function requires at least three arguments:
    • drm: Output of a fit_drm call
    • new_data: A data.frame where we seek to obtain predictions
    • seed: For reproducibility
forecast_rec <- predict_drm(drm = drm_rec,                         # Required
                            new_data = dat_test,                   # Required
                            seed = 152,                            # Required
                            f_test = f_test,                       # Optional
                            past_data = filter(dat_train,          # Optional
                                               year == max(year)))
  • When \(f_{a, t}\) is used at the time of model fit, we also need to provide f_test to predict_drm.

  • past_data is necessary when survival is a function of environmental variables.

Forecasts visualization & comparison

Concluding remarks

Highlights

  • The drmr substantially lowers the barrier for ecologists to use the DRM in their applications.

  • The code is easy to use and takes advantage of what has been developed for Stan: visualization, diagnostic tools, and estimation.

  • drmr allows for empirically testing which processes are more important to predict the distribution of a species.

  • The more complex a model is, the more (and better) data we need to be able to estimate those relationships.

Future work

  • Include population dynamic models that explicitly relate adult density to recruitment (e.g., Ricker, Belverton-Holt)

  • GAM-like non-parametric relationships between processes and the environment

  • Support for length-composition data

  • More realistic movement routines

Acknowledgements

  • Malin, Alexa, Jude, Jeewantha, Mark, and many others.

References

Fredston, A., Ovando, D., Godoy, L. da C., Kong, J., Muffley, B., Thorson, J. T., and Pinsky, M. (2025), “Dynamic range models improve the near-term forecast for a marine species on the move,” Ecology Letters, 00, (under review). https://doi.org/10.32942/X24D00.
Gabry, J., Češnovar, R., Johnson, A., and Bronder, S. (2024), cmdstanr: R interface to CmdStan.
Guisan, A., and Thuiller, W. (2005), “Predicting species distribution: Offering more than simple habitat models,” Ecology letters, Wiley Online Library, 8, 993–1009. https://doi.org/https://doi.org/10.1111/j.1461-0248.2005.00792.x.
Pagel, J., and Schurr, F. M. (2012), “Forecasting species ranges by statistical estimation of ecological niches and spatial population dynamics,” Global Ecology and Biogeography, 21, 293–304. https://doi.org/https://doi.org/10.1111/j.1466-8238.2011.00663.x.